Complexes of physically interacting proteins are one of the fundamental functional units
responsible for driving key biological mechanisms within the cell. Their identification is
therefore necessary not only to understand complex formation but also the higher level
organization of the cell. With the advent of "high-throughput" techniques in molecular
biology, a significant amount of physical interaction data has been catalogued from
organisms such as yeast, which has in turn fueled computational approaches to systematically
mine complexes from the network of physical interactions among proteins (PPI network).
In this survey, we review, classify and evaluate some of the key computational methods
developed to date for the identification of protein complexes from PPI networks. We
present two insightful taxonomies that reflect how these methods have evolved over
the years towards improving automated complex prediction. We also discuss some open
challenges facing accurate reconstruction of complexes, the crucial ones being the presence
of a high proportion of errors and noise in current high-throughput datasets and some key
aspects overlooked by current complex detection methods. We hope this review will not
only help to condense the history of computational complex detection for easy reference,
but also provide valuable insights to drive further research in this area.

Keywords: Protein Complex Prediction; Protein Interaction Network; Sparse Complexes 

1. Introduction 

Most biological processes within the cell are carried out by proteins that physically
interact to form stoichiometrically stable complexes. Even in the relatively
simple model organism Saccharomyces cerevisiae (budding yeast), these complexes
comprise many subunits that work in a coherent fashion. These complexes
interact with individual proteins or other complexes to form functional modules
and pathways that drive the cellular machinery. Therefore, a faithful reconstruction
of the entire set of complexes (the 'complexosome') from the physical interactions
among proteins (the 'interactome') is essential to understand not only complex
formation, but also the higher level cellular organization.

Protein complexes constitute modular functional units within the network of
physical interactions, the PPI network. From a biological perspective, this
modularity is a result of division of labor and of evolution to provide robustness against
mutation and chemical attacks. From a topological perspective, this modularity is
a result of proteins within complexes being more densely connected to each other than
to the rest of the network.

Since the advent of "high-throughput" screening in molecular biology in the
late 1990s and early 2000s, several techniques have been introduced to infer physical
interactions among proteins within organisms in a large-scale ("genome-wide")
manner. These have helped to catalogue a significant number of protein interactions in
organisms such as yeast, thereby fueling computational techniques to systematically
mine and analyze such large-scale interaction data. In yeast, the Yeast two-hybrid
(Y2H), Protein Complementation Assay and Tandem Affinity Purification followed
by Mass Spectrometry (TAP-MS) [8, 9, 10] are some of the widely adopted
experimental systems that have helped to identify a considerable fraction
of physical interactions among proteins. However, even at the current 'state-of-the-art',
these high-throughput techniques have been shown to produce a considerable
proportion of spurious (false positive) interactions. Therefore, once the
interactions are identified, their qualities need to be first assessed to generate a
reliable set of interactions that is deemed suitable for further mining and analysis.
This process includes assigning each interaction a confidence score that typically
accounts for the biological variability and technical limitations of the experimental
conditions, and therefore reflects the reliability of the inferred interaction
[15-24]. The interactions with confidence scores below a certain
threshold are discarded to build a reliable "cleaned-up" PPI network. This PPI
network is then mined to identify groups of proteins potentially forming complexes.
The whole process can be summarized in the following steps:

(1) Integrating high-throughput datasets from multiple experiments and assessing 
the reliabilities of interactions; 

(2) Constructing a reliable PPI network; 

(3) Identifying modular subnetworks from the PPI network to generate a candidate 
list of complexes; 

(4) Evaluating the identified complexes against bona fide complexes, and validating 
and assigning roles to novel complexes. 

The identification of complexes from high-throughput interaction datasets has
attracted considerable attention from both the biological and the computational
research communities, and over the years, several computational techniques have been
developed to systematically identify complexes. Quite naturally, a number of surveys
have come out from time to time evaluating and comparing these techniques
for their performance on available PPI datasets. One of the earliest comprehensive
evaluations was by Brohee and van Helden (2006) [25]. This was followed by
Vlasblom et al. (2009) [26] and Li et al. (2010) [27]. While Brohee and van Helden and
Vlasblom et al. evaluated and compared some early methods on PPI datasets available
at that time (till 2006), Li et al. covered some of the more recent methods developed
until 2009.
The purpose of our work is to provide an up-to-date survey, classification (taxonomy)
and evaluation of some of the representative works done to date (2011/2012).
We build upon the existing surveys so as to not repeat entirely the evaluations and
conclusions already drawn, yet we provide our own classifications and evaluations
of more recent techniques across up-to-date PPI datasets. We also compare across
unscored (raw) and scored PPI datasets, which is missing in these existing surveys.
We also highlight and comment on some of the newer challenges and open problems
in complex prediction, which can guide future directions for research in this area.



2. Review of existing methods for complex detection 

We begin by mentioning some definitions and terminologies widely adopted across
the reviewed works. A PPI network is modeled as an undirected graph $G = (V, E)$,
where $V$ is the set of proteins and $E = \{(u,v) : u, v \in V\}$ is the set of interactions
among protein pairs. For any protein $v \in V$, $N(v)$ is the set of direct neighbors of $v$,
while $deg(v) = |N(v)|$ is the degree of $v$. The interaction density of $G$ is defined as

$$density(G) = \frac{2|E|}{|V| \cdot (|V|-1)}.$$

This is a real number between 0 and 1, and typically quantifies the "richness of
interactions" within $G$: 0 for a network without any interactions and 1 for a fully
connected network. The clustering coefficient $CC(v)$ measures the "cliquishness" of
the neighborhood of $v$:

$$CC(v) = \frac{2|E(v)|}{|N(v)| \cdot (|N(v)|-1)},$$

where $E(v)$ is the set of edges in the neighborhood of $v$. If the interactions of
the network are reliability scored (weighted), these definitions can be extended to
their corresponding weighted versions:

$$deg_w(v) = \sum_{u \in N(v)} w(u,v), \quad density_w(G) = \frac{2\sum_{e \in E} w(e)}{|V| \cdot (|V|-1)}, \quad CC_w(v) = \frac{2\sum_{e \in E(v)} w(e)}{|N(v)| \cdot (|N(v)|-1)},$$

where $w : E \to \mathbb{R}$ is a scoring function on the interactions in $E$. Several
interesting variants have been proposed for the weighted clustering coefficient $CC_w$.
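The unweighted definitions above can be illustrated with a minimal Python sketch. The function names, the edge-list representation and the toy protein names are assumptions made for illustration only; they are not taken from any of the reviewed tools.

```python
from itertools import combinations


def density(nodes, edges):
    """density(G) = 2|E| / (|V| (|V| - 1)) for an undirected graph."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2.0 * len(edges) / (n * (n - 1))


def neighbors(v, edges):
    """N(v): proteins that share an interaction with v."""
    return {a if b == v else b for a, b in edges if v in (a, b)}


def clustering_coefficient(v, edges):
    """CC(v) = 2|E(v)| / (|N(v)| (|N(v)| - 1)); E(v) = edges among neighbors of v."""
    nbrs = neighbors(v, edges)
    k = len(nbrs)
    if k < 2:
        return 0.0
    edge_set = {frozenset(e) for e in edges}
    links = sum(1 for a, b in combinations(sorted(nbrs), 2)
                if frozenset((a, b)) in edge_set)
    return 2.0 * links / (k * (k - 1))


def weighted_degree(v, weights):
    """deg_w(v): sum of w(u, v) over the interactions of v.

    `weights` maps (u, v) tuples to reliability scores (hypothetical format).
    """
    return sum(w for (a, b), w in weights.items() if v in (a, b))


# Toy network with hypothetical protein names.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
nodes = {"A", "B", "C", "D"}
print(density(nodes, edges))
print(clustering_coefficient("C", edges))
```

In this toy network, protein C has three neighbors of which only one pair interacts, so its neighborhood is far from a clique, whereas the neighborhood of A is fully connected.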



2.1. Taxonomy of existing methods 

Although at a very generic level most existing methods make the key assumption 
that complexes are embedded among densely-interacting groups of proteins within 
PPI networks, these methods vary considerably either in the algorithmic method- 
ologies or the kind of biological insights employed to detect complexes. Accordingly, 
we classified some popular complex detection methods into two broad categories (a 
soft classification) : (i) methods based solely on graph clustering; (ii) methods based 
on graph clustering and some additional biological insights. These biological insights 
may be in the form of functional, structural, organizational or evolutionary infor- 
mation known about complexes or their constituent proteins from experimental or 
other biological studies.

We present this classification in two snapshots. The first snapshot, shown in
Figure [1] gives a chronology-based "bin-and-stack" classification, while the second
snapshot, shown in Figure [2] gives a methodology-based "tree" classification of the
methods.




Fig. 1. The "Bin-and-Stack" classification: Chronological binning of complex detection methods 
based on biological information used. It is interesting to note that over the years, as researchers 
have tried to improve the basic graph clustering ideas, they have also incorporated biological 
information into their methods. 



In the chronology-based classification, we binned methods based on the years 
in which they were developed, and stacked them based on the kind of biological 
insights used (see Figure [1]). The biological insights are grouped as: core-attachment 
structure, evolutionary information, functional coherence, and mutually exclusive 
and co-operative interactions. It is interesting to note from this classification that, 
over the years, as researchers tried to improve the basic graph clustering ideas, they 
also incorporated a variety of biological information into their methods. 

In the methodology-based classification, we distributed the methods to different 
branches of a classification tree based on the kind of computational strategy used 
(see Figure [2]). At the first level from the root, we grouped these methods into 
those based solely on graph clustering, and those employing additional biological 
insights. At subsequent levels, we further divided these methods based on the kind 
of algorithmic strategies used, into: (i) methods employing merging or growing of 
clusters; (ii) methods employing repeated partitioning of networks; and (iii) methods 
employing network alignment. The methods employing merging or growing clusters 
go "bottom-up" , that is, typically start with small "seeds" (for example, triangles 
or cliques) , and repeatedly add or remove proteins or merge clusters based on some 
similarity measures to arrive at the final set of complexes. On the other hand, the
methods based on network partitioning go "top-down", that is, repeatedly partition
or break the network into multiple subnetworks based on certain divisive criteria.
The methods based on network alignment use multiple networks (typically from
different species) to arrive at isomorphic regions that likely correspond to complexes,
the intuition being that proteins belonging to real complexes should generally be
conserved through the evolution process to act as an integrated functional unit.

Fig. 2. The Tree classification: Classification of existing methods for complex detection based
on the algorithmic methodologies used. Primarily three methodologies are adopted: merging and
growing clusters, network partitioning and network alignment.



2.2. Methods based solely on graph clustering 

Most methods that cluster the PPI network into multiple dense subnetworks make 
use of solely the topology of the network. 



Molecular COmplex DEtection (MCODE) 

MCODE, proposed by Bader and Hogue (2003), is one of the first computational
methods (and therefore, seminal) developed for complex detection from PPI
networks. The MCODE algorithm operates mainly in two stages, vertex weighting
and complex prediction, with an optional third stage for post-processing.

In the first stage, each vertex $v$ in the network $G = (V, E)$ is weighted based on its
neighborhood density. Instead of directly using the clustering coefficient, MCODE uses
the core-clustering coefficient, which measures the density of the highest $k$-core in the
neighborhood of $v$. This amplifies the weighting of densely connected regions in $G$.
In the second stage, the vertex $v$ with the highest weight is used to seed a complex.
MCODE then recursively moves outwards from the seed vertex, including into the
complex those vertices whose weight is within a given percentage (the vertex weight
parameter, VWP) of the weight of the seed vertex. A vertex once added to a complex is not
checked subsequently. The process stops when there are no more vertices to be
added to the complex, and is repeated using the next unseeded vertex. At the end
of this process multiple non-overlapping complexes are generated. The optional third
stage performs post-processing on the complexes generated from the second stage.
Complexes without 2-cores are filtered out, and new vertices in the neighborhood
with weights higher than a given 'fluff' parameter are added to existing complexes.
The resultant complexes are scored and ranked based on their densities. The time
complexity of the algorithm is $O(|V| \cdot |E| \cdot h^3)$, where $h$ is the size of the average
vertex neighbourhood in the network $G$.
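The first two stages can be sketched roughly as follows. This is a simplified illustration, not the MCODE implementation: the adjacency-dictionary format, the function names and the default VWP value are assumptions, and multiplying the core density by the core level k follows our reading of Bader and Hogue's description rather than anything stated in this survey. Post-processing (2-core filtering and 'fluff') is omitted.

```python
def highest_k_core(adj):
    """Return (k, vertices) of the highest non-empty k-core of `adj`."""
    current = {v: set(ns) for v, ns in adj.items()}
    best_k, best_core = 0, set(current)
    k = 1
    while current:
        # Repeatedly peel vertices whose remaining degree is below k.
        while True:
            low = [v for v, ns in current.items() if len(ns) < k]
            if not low:
                break
            for v in low:
                for u in current.pop(v):
                    if u in current:
                        current[u].discard(v)
        if current:
            best_k, best_core = k, set(current)
        k += 1
    return best_k, best_core


def subgraph_density(adj, vertices):
    """Density of the subgraph induced on `vertices`."""
    n = len(vertices)
    if n < 2:
        return 0.0
    degree_sum = sum(len(adj[v] & vertices) for v in vertices)  # = 2|E|
    return degree_sum / (n * (n - 1))


def vertex_weight(v, adj):
    """Stage 1: weight v by the highest k-core of its closed neighborhood."""
    nbhd = adj[v] | {v}
    induced = {u: adj[u] & nbhd for u in nbhd}
    k, core = highest_k_core(induced)
    return k * subgraph_density(adj, core)  # assumption: scaled by k


def grow_complex(seed, adj, weight, vwp=0.2):
    """Stage 2: expand outwards while vertex weights stay close to the seed's."""
    threshold = weight[seed] * (1.0 - vwp)
    members, frontier = {seed}, [seed]
    while frontier:
        v = frontier.pop()
        for u in adj[v]:
            if u not in members and weight[u] > threshold:
                members.add(u)
                frontier.append(u)
    return members


# Toy network: a 4-clique A-D with a pendant protein E attached to D.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C", "D"}, "C": {"A", "B", "D"},
       "D": {"A", "B", "C", "E"}, "E": {"D"}}
weight = {v: vertex_weight(v, adj) for v in adj}
seed = max(sorted(adj), key=weight.get)
print(sorted(grow_complex(seed, adj, weight)))
```

On this toy network the pendant protein E receives a much lower weight than the clique members, so the growth from the seed stops at the clique boundary.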



Markov CLustering (MCL) 

The Markov Clustering (MCL) algorithm, proposed by Stijn van Dongen (2000),
is a general graph clustering algorithm that simulates random walks (called flow)
to extract relatively dense regions within networks. In biological applications, it
was first applied to cluster protein families and ortholog groups before it proved
to be effective in detecting complexes from protein interaction networks.

MCL manipulates the adjacency matrix of networks with two operators called
expansion and inflation to control the random walks (flow). Expansion models the
spreading out of the flow, while inflation models the contraction of the flow, making
it thicker in dense regions and thinner in sparse regions. These operators boost the
probabilities of intra-cluster walks and demote those of inter-cluster walks.
Mathematically, expansion coincides with normal matrix multiplication, while inflation
is a Hadamard power followed by a diagonal scaling. Because both operators are simple
matrix operations, MCL is highly efficient and scalable. The iterative expansion and
inflation separates the network into multiple non-overlapping regions.
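A minimal pure-Python sketch of the expansion and inflation loop is given below. It is meant only to make the two operators concrete: the dense list-of-lists matrix, the added self-loops, the fixed iteration count and the way clusters are read off the final matrix are simplifying assumptions, and a real application would use van Dongen's MCL software.

```python
def normalize_columns(m):
    """Diagonal scaling: make every column sum to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: ordinary matrix multiplication (here, squaring)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: Hadamard (entry-wise) power followed by column scaling."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=30, eps=1e-6):
    """Return clusters as a set of frozensets of vertex indices."""
    n = len(adjacency)
    # Assumption: add self-loops before normalizing, as is commonly done.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row that retains flow defines one cluster (its non-zero columns).
    clusters = {frozenset(j for j in range(n) if m[i][j] > eps)
                for i in range(n)}
    clusters.discard(frozenset())
    return clusters


# Two triangles (0-1-2 and 3-4-5) joined by a single bridge edge 2-3.
bridged = [[0, 1, 1, 0, 0, 0],
           [1, 0, 1, 0, 0, 0],
           [1, 1, 0, 1, 0, 0],
           [0, 0, 1, 0, 1, 1],
           [0, 0, 0, 1, 0, 1],
           [0, 0, 0, 1, 1, 0]]
print(mcl(bridged))
```

Raising the inflation parameter generally yields finer-grained clusters, which is the main tuning knob when MCL is applied to PPI networks.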



Clustering based on merging Maximal Cliques (CMC) 

CMC was proposed by Liu et al. (2009) to detect complexes from PPI networks
based on repeated merging of maximal cliques. Some earlier algorithms like
CFinder and Local Clique Merging Algorithm (LCMA) also adopted clique
merging to find dense neighborhoods, but the distinct advantage of CMC over
these algorithms is its ability to work on weighted networks and to find relatively
low density regions (in subsequent improved versions of CMC).

CMC begins by enumerating all maximal cliques in the PPI network using the
Cliques algorithm proposed by Tomita et al. Although enumerating all maximal
cliques takes exponential time in the worst case, this does not pose a problem in PPI
networks because these networks are usually sparse. CMC then assigns a score to each
clique $C$ based on its weighted density, which considers the reliabilities (weights) of
the interactions within the clique:

$$Score(C) = \frac{\sum_{u \in C, v \in C} w(u,v)}{|C| \cdot (|C|-1)} \qquad (1)$$

CMC then ranks these cliques in decreasing order of their scores and itera- 
tively merges or removes highly overlapping cliques based on their inter-connectivity 
scores. The inter-connectivity score of two cliques $C_i$ and $C_j$ is based on the
non-overlapping regions of the two cliques and is defined as:

$$Inter\_score(C_i, C_j) = \sqrt{\frac{\sum_{u \in (C_i - C_j)} \sum_{v \in C_j} w(u,v)}{|C_i - C_j| \cdot |C_j|} \cdot \frac{\sum_{u \in (C_j - C_i)} \sum_{v \in C_i} w(u,v)}{|C_j - C_i| \cdot |C_i|}} \qquad (2)$$

CMC determines whether two cliques $C_i$ and $C_j$ sufficiently overlap, that is, whether
$|C_i \cap C_j| / |C_j| > overlap\_thresh$. If so, $C_j$ is either removed or merged with $C_i$ based
on the inter-connectivity score: if $Inter\_score(C_i, C_j) > merge\_thresh$, then $C_i$ and $C_j$ are
merged, else $C_j$ is removed. Finally, all the resultant merged clusters are output as
the predicted complexes.
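The scoring and merge decision can be sketched as below. This is an illustrative reading of the two scores and the overlap rule described above, not the CMC code: the `frozenset`-keyed weight dictionary, the function names and the default threshold values are all assumptions.

```python
from math import sqrt


def w(weights, u, v):
    """Reliability of the interaction (u, v); 0 if the pair does not interact."""
    return weights.get(frozenset((u, v)), 0.0)


def clique_score(clique, weights):
    """Weighted density of a clique, summing over ordered pairs u != v."""
    n = len(clique)
    if n < 2:
        return 0.0
    total = sum(w(weights, u, v) for u in clique for v in clique if u != v)
    return total / (n * (n - 1))


def inter_score(ci, cj, weights):
    """Inter-connectivity between the non-overlapping parts of ci and cj."""
    only_i, only_j = ci - cj, cj - ci
    if not only_i or not only_j:
        return 0.0
    a = sum(w(weights, u, v) for u in only_i for v in cj) / (len(only_i) * len(cj))
    b = sum(w(weights, u, v) for u in only_j for v in ci) / (len(only_j) * len(ci))
    return sqrt(a * b)


def resolve(ci, cj, weights, overlap_thresh=0.5, merge_thresh=0.25):
    """Decide the fate of the lower-ranked clique cj relative to ci."""
    if len(ci & cj) / len(cj) <= overlap_thresh:
        return "keep"      # not overlapping enough: leave both cliques alone
    if inter_score(ci, cj, weights) > merge_thresh:
        return "merge"     # well inter-connected: merge cj into ci
    return "remove"        # redundant and poorly connected: drop cj


# Two triangles sharing the edge B-C (hypothetical proteins, all weights 1).
pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
weights = {frozenset(p): 1.0 for p in pairs}
ci, cj = {"A", "B", "C"}, {"B", "C", "D"}
print(clique_score(ci, weights), inter_score(ci, cj, weights))
print(resolve(ci, cj, weights))
```

Because the inter-connectivity score only looks at interactions leaving the non-shared parts of each clique, two cliques that share most proteins but whose remaining members hardly interact are treated as redundant rather than merged.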



Clustering with Overlapping Neighborhood Expansion (ClusterONE) 
Nepusz et al. (2012) proposed ClusterONE, a method for detecting overlapping
protein complexes from weighted PPI networks, based on seeding and greedy
growth, similar to MCODE. ClusterONE uses a cohesiveness measure to determine
how likely a group of proteins is to form a complex; the measure is based on the weight of
the interactions within the group and with the rest of the network.

To begin with, ClusterONE identifies seed proteins and greedily grows them
into groups with high cohesiveness. When the greedy growth for a group cannot
progress any more, the next seed protein is selected to repeat the procedure until
no more seed proteins remain. In the second step, ClusterONE identifies highly
overlapping cohesive groups and merges them into potential complex candidates.
ClusterONE allows identification of overlapping complexes if each of the merged
groups represents an individual complex that shares proteins with others. Nepusz et al.'s
comparisons with methods like MCODE, MCL and CMC showed that the complexes from
ClusterONE achieved comparable accuracies when matched against known 'gold
standard' complexes, and that MCL achieved the closest performance to ClusterONE,
with the exception that MCL produced only non-overlapping clusters; producing
overlapping clusters is a distinct advantage of ClusterONE.
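The greedy growth step can be sketched as follows. The cohesiveness formula used here, internal weight divided by internal plus boundary weight plus a size penalty, is taken from our understanding of Nepusz et al.'s paper and is not spelled out in this survey; the function names, the weight format and the tie-breaking by sorted order are assumptions, and seed ordering and the merging of overlapping groups are left out.

```python
def cohesiveness(group, weights, penalty=0.0):
    """w_in / (w_in + w_bound + penalty * |group|)."""
    w_in = w_bound = 0.0
    for edge, weight in weights.items():
        inside = len(edge & group)
        if inside == 2:
            w_in += weight       # both endpoints inside the group
        elif inside == 1:
            w_bound += weight    # edge crossing the group boundary
    denom = w_in + w_bound + penalty * len(group)
    return w_in / denom if denom else 0.0


def grow(seed, weights, penalty=0.0):
    """Greedily add or remove one protein at a time while cohesiveness rises."""
    group = {seed}
    while True:
        current = cohesiveness(group, weights, penalty)
        boundary = {u for e in weights if len(e & group) == 1 for u in e - group}
        candidates = [group | {u} for u in sorted(boundary)]
        if len(group) > 1:
            candidates += [group - {u} for u in sorted(group)]
        best = max(candidates,
                   key=lambda g: cohesiveness(g, weights, penalty),
                   default=None)
        if best is None or cohesiveness(best, weights, penalty) <= current:
            return group
        group = best


# Two 4-cliques (A-D and E-H) joined by the bridge D-E, all weights 1.
left, right = ["A", "B", "C", "D"], ["E", "F", "G", "H"]
pairs = [(a, b) for grp in (left, right)
         for i, a in enumerate(grp) for b in grp[i + 1:]]
pairs.append(("D", "E"))
weights = {frozenset(p): 1.0 for p in pairs}
print(sorted(grow("A", weights)))
```

Growing from seed A stops at the first clique, because pulling in E would add three boundary interactions for a single internal one and so lower the cohesiveness.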



Some other methods based on graph clustering 

Apart from the methods discussed above, three other methods worth mentioning here
are LCMA (2005), PCP (2007) and HACO (2009). The LCMA algorithm
first locates cliques within local neighborhoods using vertex degrees and then merges
them based on overlaps to produce complexes. Protein Complex Prediction (PCP)
uses FS Weight scoring to remove unreliable interactions and add indirect interactions,
and then merges cliques to produce the final list of complexes. On the other
hand, HACO uses hierarchical agglomerative clustering to produce the initial set of
(non-overlapping) clusters. Proteins are then assigned to multiple clusters based on
their interactions to the clusters to produce the final list of overlapping clusters.

A few other recently proposed (2010 - 2011) methods include those by Zhang
et al., Ma et al., Wang et al. and Chin et al. These use the property of
"bridgeness" of cross-edges among clusters along with the internal connectivities to
detect complexes.



2.3. Methods incorporating core-attachment structure

Gavin and colleagues (2006) performed a large-scale analysis of yeast complexes
and found that the proteins within complexes were divided into two distinct groups,
"cores" and "attachments". The cores formed central functional units of complexes,
while the attachment proteins aided these cores in performing their functions. Several
computational methods were proposed to reconstruct complexes from PPI networks
by capitalizing on this structural organization.

Wu et al. (2009) proposed the COACH method, which reconstructs complexes
in two stages: it identifies dense core regions, and subsequently includes
proteins as attachments to these cores. Figure [3] summarizes how COACH identifies
core and attachment proteins to build complexes.




Fig. 3. The identification of core and attachment proteins in COACH: The cores are first identified 
based on vertex degrees in the neighborhood graphs. Attachment proteins are then appended to 
these cores to build the final complexes.

Leung et al. (2009) proposed the CORE method to identify protein cores
within the PPI network. They defined the probability of two proteins $p_1$ and $p_2$ (of
degrees $d_1$ and $d_2$, respectively) to belong to the same core using two main factors:
whether the two proteins interact or not and the number of common neighbors
$m$ between them. The probability that $p_1$ and $p_2$ have $\geq i$ interactions and $\geq m$
common neighbors is calculated under the null hypothesis that $d_1$ edges connecting
$p_1$ and $d_2$ edges connecting $p_2$ are randomly assigned in the PPI network according
to a uniform distribution. This probability is used to arrive at a p-value for whether
$p_1$ and $p_2$ belong to the same core. Subsequently, CORE merges sets of core proteins
of sizes two, three, etc. until further increase in size is not possible, to produce the
final set of cores. CORE then scores and ranks the predicted cores based on the
number of internal and external interactions in them. The attachments are added
to these cores in a manner similar to COACH to produce the final set of complexes.

Srihari et al. proposed MCL-CA (2009) and the improved (weighted) version
MCL-CAw (2011), which identify complexes by refining clusters produced by
the MCL algorithm by incorporating core-attachment structure. Essentially,
MCL-CAw categorizes proteins within MCL clusters into "core" and "attachment"
proteins based on their connectivities, and then selects only these categorized proteins
into complexes while discarding the remaining "noisy" proteins. This enables
MCL-CAw to "trim" the raw MCL clusters. Further, unlike CORE and COACH,
the refinement in MCL-CAw capitalizes on reliability scores assigned to the interactions.
Consequently, MCL-CAw reconstructs a significantly higher number of 'gold
standard' complexes (about 30% higher) and with better accuracies compared to plain
MCL.

Chin et al. proposed the HUNTER algorithm, which begins by generating
a module seed $MS(v)$ for each node $v$ in the PPI network. $MS(v)$ is then
pruned by removing vertices having low weight edges to other members of $MS(v)$.
Then the maximal $q$-connected subnetwork of $MS(v)$ is selected as the initial core
$MQC(v)$. This core is then expanded into a module by adding new vertices that
share many neighbors with $MQC(v)$. If two modules overlap beyond a certain
threshold, these modules are merged. The resultant collection of modules forms the
final set of predicted complexes.



2.4. Methods incorporating functional information 

Proteins within complexes are generally enriched with the same or similar functions.
If the functional information for proteins from an organism is available, then this
information can be combined with topological information from PPI networks for
the reconstruction of complexes from the organism. One possible way to incorporate
functional information is to score the interactions based on the functional similarity
between the interacting pairs of proteins. Alternatively, functional annotations (for
example, from Gene Ontology) can be used to aid decisions where including or
excluding a protein from complexes purely based on topological information might
be difficult.




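Restricted Neighborhood Search Clustering (RNSC)

King et al. proposed the RNSC algorithm, which combines graph clustering with
Gene Ontology information to detect complexes. The algorithm operates in two
steps: it begins by clustering the PPI network and then filters the clusters based
on cluster properties and functional homogeneity.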

The network G = (V, E) is first randomly partitioned into multiple subnet- 
works, which is essentially a partitioning of the node set V. The algorithm then 
iteratively moves nodes from one cluster to another in a randomized fashion till an 
integer-valued cost function is optimized. A common problem among such cluster- 
ing algorithms is the tendency to settle in poor local minima. To avoid this, the 
RNSC algorithm adopts diversification moves, which shuffle the clustering by occa- 
sionally dispersing the contents of a cluster at random. Once the clustering process 
is completed, clusters of small sizes or densities (the lower bound on cluster sizes 
and densities are experimentally determined) are discarded. Finally, a p-value is 
calculated using functional annotations (from GO) for each cluster that measures 
the functional homogeneity of the clusters. All clusters above a certain p-value are 
discarded to produce the final list of predicted complexes. Based on experiments, 
King et al. recommend a cluster density cut-off of 0.70 and a p-value cut-off of $10^{-3}$.
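The filtering step can be illustrated with a short sketch. The survey does not state which statistical test RNSC uses, so the hypergeometric tail probability below is an assumption (it is a common choice for annotation enrichment), as are the function names and the minimum cluster size; only the two cut-off values come from the text above.

```python
from math import comb


def hypergeom_pvalue(cluster_size, hits, population, annotated):
    """P(X >= hits) when drawing `cluster_size` proteins without replacement
    from `population` proteins, of which `annotated` carry a given GO term."""
    total = comb(population, cluster_size)
    upper = min(cluster_size, annotated)
    tail = sum(comb(annotated, k) * comb(population - annotated, cluster_size - k)
               for k in range(hits, upper + 1))
    return tail / total


def keep_cluster(size, density, pvalue,
                 min_size=3, min_density=0.70, max_pvalue=1e-3):
    """Discard clusters that are too small, too sparse or not homogeneous."""
    return size >= min_size and density >= min_density and pvalue <= max_pvalue


# Hypothetical numbers: 6 proteins in the cluster, 5 share a GO term that
# annotates 40 of 5000 proteins overall.
p = hypergeom_pvalue(cluster_size=6, hits=5, population=5000, annotated=40)
print(p, keep_cluster(size=6, density=0.8, pvalue=p))
```

A smaller p-value means the shared annotation is less likely to have arisen by chance, which is why clusters above the cut-off are the ones discarded.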

Dense neighborhood Extraction using Connectivity and conFidence Features 
(DECAFF) 

Li et al. (2007) [51] proposed the DECAFF algorithm, which is essentially an extension
of the LCMA algorithm proposed earlier by the same group. DECAFF identifies
dense subgraphs in a neighborhood graph using a hub-removal algorithm. Local
cliques are identified in these dense subgraphs and merged based on overlaps to
produce clusters. Each cluster is assigned a functional reliability score, which is
the average of the reliabilities of the edges within the cluster. All clusters with low
reliabilities are discarded to produce the final set of predicted complexes.



The PCP algorithm discussed earlier can also be categorized into this set
of methods because PCP uses a weighting scheme based on functional similarity
(though the similarity is inferred from topology) to assign reliability scores to
interactions, and then uses a clique merging strategy to detect complexes.

2.5. Methods incorporating evolutionary information

The increasing availability of PPI data from multiple species like yeast, fly, worm
and some mammals has made it feasible to use insights from cross-species analysis
for detection of (conserved) complexes. The assumption is that proteins belonging
to real complexes should generally be conserved through the evolution process to
act as an integrated functional unit.

Sharan et al. proposed methods (2005-2007) for detection of conserved complexes
across species based on the evolution of PPI networks. In these methods, an
orthology network (network alignment graph) is constructed from the PPI networks
of different species, which essentially represents the orthologous proteins and their
conserved interactions across the species. For a protein pair $\{u_1, v_1\}$ in network $G_1$
of species $S_1$ and $\{u_2, v_2\}$ in $G_2$ of species $S_2$, the orthology network $G_{12}$ contains the
pair $\{u, v\}$ if $u_1$ is orthologous to $u_2$, and $v_1$ is orthologous to $v_2$. The edge $(u, v)$ is
weighted by the sequence similarities between the pairs $\{u_1, v_1\}$ and $\{u_2, v_2\}$. Any
subgraph in this orthology network $G_{12}$ is therefore a conserved subnetwork of $G_1$
and $G_2$. Such candidate subgraphs are then evaluated for parts of conserved complexes.
Based on this idea, a tool QNet was developed which returns conserved
complexes from different species when queried using known complexes from yeast.
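The construction of the orthology network can be sketched as follows. This is a simplified illustration and not Sharan et al.'s algorithm: it assumes each aligned node is an ortholog pair (one protein per species), keeps only interactions conserved in both networks, and omits the edge weighting; the function name and input formats are hypothetical.

```python
def orthology_network(edges1, edges2, orthologs):
    """Build conserved edges between ortholog pairs.

    edges1, edges2: interaction lists of species 1 and species 2.
    orthologs: set of (p1, p2) tuples, p1 from species 1 and p2 from species 2.
    Returns a set of frozensets {(u1, u2), (v1, v2)}.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    aligned = set()
    for u1, u2 in orthologs:
        for v1, v2 in orthologs:
            if (u1 != v1 and u2 != v2
                    and frozenset((u1, v1)) in e1
                    and frozenset((u2, v2)) in e2):
                aligned.add(frozenset(((u1, u2), (v1, v2))))
    return aligned


# Hypothetical proteins: a1-b1 and b1-c1 interact in species 1,
# but only a2-b2 interacts in species 2.
edges1 = [("a1", "b1"), ("b1", "c1")]
edges2 = [("a2", "b2")]
orthologs = {("a1", "a2"), ("b1", "b2"), ("c1", "c2")}
print(orthology_network(edges1, edges2, orthologs))
```

Only the a-b interaction survives in the aligned graph, since the b-c interaction is not observed in the second species.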

2.6. Methods based on co-operative and exclusive interactions 

The overlapping binding interfaces in a protein may prevent multiple interactions 
involving these interfaces from occurring simultaneously. In other words, the set
of interactions in which a protein participates may be either co-operative or mutu- 
ally exclusive. The information about the co-occurrence or exclusiveness of inter- 
actions can therefore be useful for predicting complexes with higher accuracy. This 
information can be gathered from the interacting domains of protein pairs or the 
three-dimensional structures of the interacting surfaces. 

Ozawa et al. (2010) proposed a refinement method over MCODE and MCL
to filter predicted complexes based on exclusive and co-operative interactions. They
used domain-domain interactions to identify conflicting pairs of protein interactions
in order to include or exclude proteins within candidate complexes. Based on their
results, the accuracies of predicted complexes from MCODE and MCL improved
two-fold.

On the other hand, Jung et al. (2010) used structural interface data to construct
a simultaneous PPI network (SPIN) containing only co-operative interactions
and excluding competition from mutually exclusive interactions. MCODE
and LCMA algorithms tested on this SPIN displayed a sizeable improvement in 
correctly predicted complexes. 

Even though incorporating information about co-operative and exclusive inter- 
actions shows promising improvement in complex detection algorithms, there are 
still several practical problems related to this approach. Gathering more data on 
conflicting interactions, especially based on three-dimensional structures of inter- 
faces, needs to be addressed before this approach can be more easily adopted. 

2.7. Incorporating other possible kinds of information 

In a recent foresightful survey by Przytycka et al., the application of network
dynamics (temporal information) to current computational analysis is discussed
at length, especially with respect to detection of complexes and pathways
from protein interaction networks. The authors suggest that if sufficient information
about the 'timing activities' of proteins can be obtained, the dynamical nature
of the underlying organizational principles in interaction networks can be better
understood. This shift from static to dynamic network analysis is vital to understanding
several cellular processes, some of which may have been wrongly understood
due to ignoring dynamic information.



3. Comparative assessment of existing methods 

Considering the wide variety of proposed methods for complex detection, one can
gauge the seriousness of the research effort towards computational identification and
categorization of complexes. Several surveys and experiments [25, 26, 27] have focused
on the comparative analyses of these proposed methods for complex detection. Each
new work on complex detection also comes with detailed comparative analyses of
the new method with some earlier methods. However, due to the differences in PPI
and benchmark datasets, evaluation criteria, thresholds and parameters used, and
the subset of methods considered for these comparative assessments, different works
arrive at different results on the relative performance of methods. But, typically the
following broadly accepted criteria are used across the works.

If a reasonably large 'gold standard' set of complexes is available (as in the case of
yeast), the performance of a method can be gauged on how accurately its predicted
complexes reconstruct or recover the 'gold standard'. Two commonly adopted measures
for this are precision and recall. Precision measures how many among the
predicted complexes match some 'gold standard' complex, in turn measuring the
proportion of reliable predictions (accuracy) from the method. Recall measures how
many of the 'gold standard' complexes are reconstructed by the method, in turn
measuring the coverage or sensitivity of the method. Some methods tend to produce
too many (sometimes arbitrary) predictions resulting in high recall but very
low precision, and therefore too many false positives to consider the method even
reasonably reliable. To handle this, a combination of precision and recall, usually
through a harmonic mean called the F-measure, is used to evaluate how "balanced"
the method is.
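These three measures can be computed as in the sketch below. The criterion for when a predicted complex "matches" a benchmark complex differs between studies; the Jaccard overlap with a threshold of 0.5 used here is an assumption chosen for illustration, as are the function names.

```python
def jaccard(a, b):
    """Overlap between two protein sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evaluate(predicted, gold, match_thresh=0.5):
    """Return (precision, recall, F-measure) for lists of protein sets."""
    def matches(x, others):
        return any(jaccard(x, y) >= match_thresh for y in others)

    matched_pred = sum(1 for p in predicted if matches(p, gold))
    matched_gold = sum(1 for g in gold if matches(g, predicted))
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical predictions and benchmark complexes.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
gold = [{"A", "B", "C", "D"}, {"P", "Q", "R"}]
print(evaluate(predicted, gold))
```

Here one of the two predictions matches a benchmark complex and one of the two benchmark complexes is recovered, so precision, recall and F-measure are all 0.5.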

On the other hand, if a 'gold standard' set is not available (as in the case of mammals,
currently), "self-evaluatory" measures like cluster cohesiveness and separability
are used. The cohesiveness of a predicted complex (cluster) usually measures
topological characteristics of the cluster, for example, its interaction density or size,
while separability measures how separated the cluster is from others. A combination
of cohesiveness and separability reveals how modular the clustering is and
therefore how likely the individual clusters are to represent real complexes.
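
As a rough illustration of such self-evaluatory measures, the sketch below computes interaction density as a cohesiveness measure together with a simple separability ratio. Exact definitions vary between methods; the separability formula used here (internal edges divided by all edges touching the cluster), the function names and the toy network are assumptions made for illustration only.

```python
def normalise(edges):
    """Return an undirected, de-duplicated edge set without self-loops."""
    return {frozenset(e) for e in edges if e[0] != e[1]}


def edge_density(cluster, edges):
    """Cohesiveness: fraction of possible protein pairs in the cluster that interact."""
    members = set(cluster)
    n = len(members)
    if n < 2:
        return 0.0
    internal = sum(1 for e in normalise(edges) if e <= members)
    return internal / (n * (n - 1) / 2)


def separability(cluster, edges):
    """Assumed definition: internal edges divided by all edges touching the cluster."""
    members = set(cluster)
    touching = [e for e in normalise(edges) if e & members]
    if not touching:
        return 0.0
    internal = sum(1 for e in touching if e <= members)
    return internal / len(touching)


# Toy network: a triangle A-B-C with a tail C-D-E (hypothetical protein names).
toy_ppi = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
toy_cluster = {"A", "B", "C"}
print(edge_density(toy_cluster, toy_ppi), separability(toy_cluster, toy_ppi))
```

A cluster that is both dense and well separated scores close to 1 on both measures; the triangle above is fully dense, but one of its four incident edges leaves the cluster.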

Another typically independent way to evaluate the predictions is to measure the
functional or co-localization coherence of the clusters, subject, however, to the
availability of appropriate annotation data. This captures how functionally coherent
the proteins within a predicted complex are and whether they are co-localized
within the cell. The usual annotations required for these calculations are functions
and sub-cellular localizations of the proteins. This evaluation is particularly useful
for alternative validation of the predictions.
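
One simple way to quantify such coherence is sketched below: the fraction of protein pairs within a cluster that share at least one annotation term. This particular definition, the function name `coherence` and the toy annotations are assumptions for illustration; published evaluations use a variety of coherence scores.

```python
from itertools import combinations


def coherence(cluster, annotations):
    """Fraction of annotated protein pairs in the cluster sharing at least one term."""
    annotated = [p for p in sorted(cluster) if annotations.get(p)]
    pairs = list(combinations(annotated, 2))
    if not pairs:
        return 0.0
    shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
    return shared / len(pairs)


# Toy functional annotations (hypothetical proteins, not real data).
toy_terms = {
    "P1": {"ribosome biogenesis"},
    "P2": {"ribosome biogenesis", "rRNA processing"},
    "P3": {"rRNA processing"},
    "P4": {"DNA repair"},
}
print(coherence({"P1", "P2", "P3"}, toy_terms))
```

The same function can be applied to sub-cellular localization by passing a mapping from proteins to sets of compartments instead of functional terms.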

Now, we present a summary of some representative surveys and comparative
assessments and their conclusions. One of the first comprehensive assessments was
performed by Brohée and van Helden (2006). They performed a detailed empirical
comparison between MCODE, MCL, RNSC and Super-paramagnetic
Clustering (SPC). These algorithms were tested on PPI datasets from high-throughput
experiments, and the resultant complexes were evaluated against benchmark
complexes from MIPS. Additionally, the PPI datasets were introduced with
artificial noise (random edge addition and deletion) to test the robustness of these
algorithms. They concluded that MCL and RNSC outperformed MCODE and SPC
in terms of precision (the proportion of correctly predicted complexes) and recall
(the proportion of correctly derived benchmarks). RNSC was robust to variation in
its input parameter settings, while the performance of the other three varied widely
for parameter changes. MCL was remarkably robust even upon introducing 80%-100%
random noise. Overall, the experiments confirmed the general superiority of
MCL over the other three algorithms.
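
The kind of robustness test described above (random edge addition and deletion) can be sketched as follows. This is a minimal illustration, not the protocol of the cited study: it assumes noise levels are expressed as fractions of the original number of edges, and the function name `perturb_network` and the toy network are hypothetical.

```python
import random
from itertools import combinations


def perturb_network(edges, add_frac, del_frac, seed=0):
    """Delete a fraction of the edges and add random non-edges among the same proteins."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted({p for e in edge_set for p in e})
    n_del = round(del_frac * len(edge_set))
    n_add = round(add_frac * len(edge_set))
    # Sort before sampling so that results are reproducible for a given seed.
    kept = set(rng.sample(sorted(edge_set, key=sorted), len(edge_set) - n_del))
    # Enumerating all non-edges is quadratic in the number of proteins;
    # acceptable for a sketch, too slow for a full interactome.
    non_edges = [frozenset(p) for p in combinations(nodes, 2)
                 if frozenset(p) not in edge_set]
    added = set(rng.sample(non_edges, min(n_add, len(non_edges))))
    return kept | added


toy_network = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
noisy = perturb_network(toy_network, add_frac=0.4, del_frac=0.2, seed=1)
print(len(noisy))
```

A clustering method would then be run on `noisy` and its precision and recall compared with those obtained on the unperturbed network.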

Vlasblom et al. (2009) compared MCL with another clustering algorithm,
Affinity Propagation (AP), on unweighted as well as weighted PPI networks. The
initial unweighted network was built from a set of 408 hand-curated complexes
from the Wodak lab, followed by random addition and removal of edges to mimic
real PPI networks. The weighted network was obtained from Collins et al.'s
work, generated from the Gavin and Krogan datasets. They concluded that MCL
performed considerably better than AP in terms of accuracy and separation of
predicted clusters, and robustness to random noise. In particular, MCL was able to
achieve about 90% accuracy and 80% separation compared to only 70% accuracy
and 50% separation of AP on unweighted PPI networks with introduced random
noise. MCL was able to discover benchmark complexes even at high (40%) noise
levels.

More recently (2010), Li et al. performed a detailed comparative evaluation
of several algorithms: MCODE, MCL, CORE, COACH, RNSC and DECAFF.
These algorithms were tested on PPI datasets from DIP and Krogan et al.
The DIP network consisted of 17203 interactions among 4930 proteins,
while the Krogan dataset consisted of 14077 interactions among 3581 proteins. They
used a total of 428 benchmark complexes from MIPS [60], Aloy et al. and SGD. A
cluster P from a method was considered a correct match to a benchmark complex B
using the Bader score [22]: $|V_P \cap V_B|^2 / (|V_P| \cdot |V_B|) > 0.20$, where $V_P$
denotes the set of proteins in P, and $V_B$ denotes the set of proteins in B. Based on
this criterion, the precision, recall and F-measure values were calculated. The
comparisons of these algorithms are shown in Figure 4 (adapted from Li et al.). The
methods are arranged in chronological order, and it is interesting to note that over
the years, the F-measures have improved. Li et al. concluded that MCL, RNSC, CORE,
COACH and DECAFF attained the best recall values. MCODE was able to achieve the
highest precision, but it produced very few clusters resulting in very low recall.

Fig. 4. Comparative performance of complex detection methods in terms of precision, recall and
F-measure on DIP and Krogan datasets. The methods are arranged in chronological order, and it
is interesting to note that over the years, the F-measures have improved.
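
The matching criterion used in that evaluation can be written down directly. The sketch below implements the Bader score as stated in the text, with the 0.20 cut-off as the default; the function names and the toy complexes are illustrative assumptions.

```python
def bader_score(predicted, benchmark):
    """Squared overlap normalised by the product of the two complex sizes."""
    p, b = set(predicted), set(benchmark)
    if not p or not b:
        return 0.0
    return len(p & b) ** 2 / (len(p) * len(b))


def is_match(predicted, benchmark, threshold=0.20):
    """A prediction matches a benchmark if its score exceeds the threshold."""
    return bader_score(predicted, benchmark) > threshold


# Hypothetical complexes: 4 predicted proteins against 5 benchmark proteins.
benchmark_complex = {"C", "D", "E", "F", "G"}
print(is_match({"A", "B", "C", "D"}, benchmark_complex))  # overlap of 2: 4/20
print(is_match({"A", "C", "D", "E"}, benchmark_complex))  # overlap of 3: 9/20
```

With a strict inequality, an overlap of 2 between complexes of sizes 4 and 5 scores exactly 0.20 and is therefore not counted as a match, whereas an overlap of 3 scores 0.45 and is.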



3.1. Our assessment of some complex detection methods 
3.1.1. Preparation of experimental data 

In our assessment, we experimentally evaluated some key complex detection methods
on both unscored (raw) and scored PPI networks. To build our unscored network,
we combined the physical interactions from two TAP-MS experiments, Gavin
et al. (2006) [9] and Krogan et al. (2006), which we call the Gavin+Krogan (G+K)
network. We then gathered the scored version of this network, the Consolidated
network from Collins et al. (2007). This network comprises interactions from
Gavin et al. and Krogan et al. scored using the Purification Enrichment scheme.
Some of the properties of these networks are shown in Table 1.
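
The two preparation steps described above, taking the union of two interaction sets and retaining only sufficiently reliable scored interactions, can be sketched as follows. The function names, the toy interactions and the cut-off value in the example are illustrative assumptions and do not reproduce the actual datasets or the scoring scheme.

```python
def undirected(edges):
    """Treat (a, b) and (b, a) as the same interaction and drop self-interactions."""
    return {frozenset((a, b)) for a, b in edges if a != b}


def combine(*networks):
    """Union of several unscored interaction sets."""
    merged = set()
    for network in networks:
        merged |= undirected(network)
    return merged


def filter_scored(scored_edges, cutoff):
    """Keep interactions whose reliability score is at least the cut-off."""
    return {frozenset((a, b)) for a, b, score in scored_edges
            if a != b and score >= cutoff}


# Toy inputs with hypothetical protein names and scores.
first = [("A", "B"), ("B", "C")]
second = [("B", "A"), ("C", "D")]
combined = combine(first, second)
scored = [("A", "B", 4.1), ("B", "C", 1.2), ("C", "D", 3.5)]
reliable = filter_scored(scored, cutoff=3.0)
print(len(combined), len(reliable))
```

Representing each interaction as a `frozenset` makes the union insensitive to the order in which the two proteins are reported by different experiments.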

The benchmark (reference or 'gold standard') set of complexes was built from
three independent sources: 408 complexes of the Wodak lab CYC2008 catalogue,
313 complexes of MIPS, and 101 complexes curated by Aloy et al. The properties
of these reference sets are shown in Table 2. We considered each of these
reference sets independently for the evaluation of the methods. We did not merge
them into one comprehensive list of complexes because the individual complex
compositions are different across the three sources and some complexes may also get
double-counted (because of different names used for the same complex).

PPI Network      # Proteins   # Interactions   Avg node degree
Gavin                  1430             7592             10.62
Krogan 'Core'          2708             7123              5.26
Gavin+Krogan           2964            13507              9.12
Consolidated           1622             9704             11.96

Table 1. Properties of the PPI networks used for the evaluation of methods









                                         # Complexes of size
Benchmark   # Complexes   # Proteins   < 3   3-10   11-25   > 25   Avg density
Wodak               408         1627   172    204      27      5         0.639
MIPS                313         1225   106    138      42     27         0.412
Aloy                101          630    23     58      19      1         0.747

Table 2. Properties of hand-curated (bona fide) yeast complexes from Wodak lab, MIPS and Aloy



3.1.2. Metrics for evaluating the predicted complexes 

Let $B = \{B_1, B_2, \ldots, B_m\}$ and $C = \{C_1, C_2, \ldots, C_n\}$ be the sets of benchmark and
predicted complexes, respectively. We use the Jaccard coefficient J to quantify the
overlap between a benchmark complex $B_i$ and a predicted complex $C_j$:

$$J(B_i, C_j) = \frac{|B_i \cap C_j|}{|B_i \cup C_j|}$$

We consider $B_i$ to be covered by $C_j$ if $J(B_i, C_j) > t$, where t is the overlap threshold.
In our experiments, we set the threshold t = 0.5, which requires
$|B_i \cap C_j| > \frac{|B_i| + |C_j|}{3}$. For example, if $|B_i| = |C_j| = 8$, the overlap
between $B_i$ and $C_j$ should be at least 6.

We use previously reported definitions of recall (coverage) and precision (accuracy)
of the set of predicted complexes:

$$Rc = \frac{|\{B_i \mid B_i \in B \wedge \exists C_j \in C;\ J(B_i, C_j) > t\}|}{|B|}$$

Here, $|\{B_i \mid B_i \in B \wedge \exists C_j \in C;\ J(B_i, C_j) > t\}|$ gives the number of derived
benchmarks.

$$Pr = \frac{|\{C_j \mid C_j \in C \wedge \exists B_i \in B;\ J(B_i, C_j) > t\}|}{|C|}$$

Here, $|\{C_j \mid C_j \in C \wedge \exists B_i \in B;\ J(B_i, C_j) > t\}|$ gives the number of matched
predictions.

We calculated the F-measure as the harmonic mean of precision and recall,

$$F = \frac{2 \cdot Pr \cdot Rc}{Pr + Rc}$$
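
The definitions in this subsection can be sketched in a few lines of code. The function names and the toy complexes below are assumptions for illustration; the matching rule (Jaccard coefficient strictly above t = 0.5) and the formulas for recall, precision and F-measure follow the text.

```python
def jaccard(a, b):
    """Jaccard coefficient of two protein sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evaluate(benchmarks, predictions, t=0.5):
    """Return (precision, recall, F-measure) for a set of predicted complexes."""
    derived = [b for b in benchmarks
               if any(jaccard(b, c) > t for c in predictions)]
    matched = [c for c in predictions
               if any(jaccard(b, c) > t for b in benchmarks)]
    recall = len(derived) / len(benchmarks) if benchmarks else 0.0
    precision = len(matched) / len(predictions) if predictions else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data with hypothetical protein names.
toy_benchmarks = [{"A", "B", "C", "D"}, {"E", "F", "G"}, {"H", "I"}]
toy_predictions = [{"A", "B", "C"}, {"E", "F", "X", "Y"}, {"H", "I"}, {"Z", "W"}]
precision, recall, f_measure = evaluate(toy_benchmarks, toy_predictions)
print(precision, recall, f_measure)
```

In this toy case two of the three benchmarks are derived and two of the four predictions are matched; the prediction {E, F, X, Y} overlaps {E, F, G} with a Jaccard coefficient of only 2/5 and therefore counts towards neither.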



3.1.3. Experimental evaluation of methods 

We considered the following methods for our evaluation: 

• On the unscored network, MCL (2002, 2004), MCL-CA (2009), MCL-CAw (2010) [47, 48],
CORE (2009), COACH (2009), CMC (2009) [34] and HACO (2009).

• On the scored network, MCL (2002, 2004), MCLO (2007), MCL-CAw (2010) [47, 48],
CMC (2009) [34] and HACO (2009).

We do not show comparisons for older methods like MCODE (2003) and
RNSC (2004) because these have been evaluated extensively in several earlier
surveys; instead we included MCL as a benchmark in all our comparisons
since MCL has been repeatedly shown to perform better than these older
methods [25, 26, 27]. Further, not all methods are devised to make use of interaction
confidence scores, and therefore we selected only the ones capable of doing so for
the evaluations on the scored network.

Table 3 shows the precision and recall values for methods evaluated on the un- 
scored Gavin+Krogan network across the three benchmark sets. The table shows 
that CORE, HACO and MCL-CAw performed significantly better in terms of recall 
compared to the rest of the methods. In particular, MCL-CAw performed consider- 
ably better than plain MCL indicating that incorporating core-attachment structure 
into raw MCL clusters helped to improve the accuracies of the predicted complexes. 
This indicated that incorporating some kind of biological knowledge helped to iden- 
tify complexes more accurately. 

Next, Table 4 shows these values for the methods evaluated on the scored Consolidated
network. This table shows that all methods were able to reconstruct a significantly
higher number of complexes on the scored network than on the unscored
network. This clearly indicated that noise in raw datasets (negatively) impacted the
performance of methods, and reliability scoring aided in alleviating this impact and
thereby improving the performance of methods. This demonstrated the effectiveness
of current reliability scoring schemes in cleaning raw interaction datasets for
focused studies such as complex detection.



The unscored Gavin+Krogan network
#Proteins 2964; #Interactions 13507

Benchmark      Measure       MCL   MCL-CA  MCL-CAw    COACH     CORE      CMC     HACO
               #Predicted    242      219      130      447      386      113      278

Wodak (#182)   #Matched       55       49       69       62       83       60       78
               Precision   0.226    0.224    0.531    0.139    0.215    0.531    0.281
               #Derived       62       49       75       49       83       60       85
               Recall      0.338    0.269    0.412    0.269    0.456    0.330    0.467

MIPS (#177)    #Matched       35       42       42       45       59       41       45
               Precision   0.143    0.192    0.323    0.101    0.153    0.363    0.162
               #Derived       40       42       53       38       59       41       57
               Recall      0.226    0.237    0.300    0.215    0.333    0.232    0.322

Aloy (#76)     #Matched       43       41       47       54       59       43       59
               Precision   0.179    0.187    0.362    0.121    0.153    0.381    0.212
               #Derived       42       41       52       37       59       43       59
               Recall      0.556    0.539    0.684    0.487    0.776    0.566    0.776

Table 3. Comparisons between different methods on the unscored Gavin+Krogan network. CORE
showed the best recall followed by HACO and MCL-CAw.

The Consolidated_3.19 network
#Proteins 1622; #Interactions 9704

Benchmark      Measure       MCL     MCLO  MCL-CAw      CMC     HACO
               #Predicted    116      119      130       77      101

Wodak (#145)   #Matched       70       80       83       67       57
               Precision   0.603    0.672    0.638    0.870    0.564
               #Derived       79       80       90       67       64
               Recall      0.545    0.552    0.621    0.462    0.441

MIPS (#157)    #Matched       48       65       53       56       40
               Precision   0.414    0.546    0.408    0.727    0.396
               #Derived       63       65       67       56       57
               Recall      0.401    0.414    0.427    0.357    0.363

Aloy (#76)     #Matched       54       56       57       45       44
               Precision   0.466    0.471    0.438    0.584    0.436
               #Derived       55       56       55       45       45
               Recall      0.724    0.737    0.724    0.592    0.592

Table 4. Comparisons between the different methods on the Consolidated_3.19 network. MCL-CAw
showed the best recall followed by CMC.

3.1.4. Plugging experimental results into our taxonomy

We next "plugged-in" these evaluation results as well as results obtained from some
earlier surveys into our "bin-and-stack" classification, as shown in Figure 5. For
each method, we show the F-values before / after scoring, that is, on the unscored
G+K network and the scored Consolidated network. Such a representation in our
classification revealed two interesting insights,

(1) incorporating biological information in addition to PPI topology improved
performance of the methods (the F measures have increased in the vertical layers
compared to the lowest layer);
(2) reliability scoring significantly improved performance of the methods, as shown
by the before-after values.

This representation also shows how complex detection methods have evolved over
the years to improve performance, and therefore our taxonomy can be useful to
guide future directions for further improvement.

Fig. 5. Plugging-in F-values (before-after scoring) of existing methods into our Bin-and-Stack
classification. Incorporating biological information and affinity scoring significantly boosts
performance.
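
To make the before / after comparison concrete, the sketch below recomputes F-values from the precision and recall figures reported for the Wodak benchmark in Tables 3 and 4, for the four methods evaluated on both networks. It is only a worked illustration: the values plotted in Fig. 5 may have been derived differently (for example, from another benchmark set), and the function name `f_measure` is an assumption.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall) on the Wodak benchmark, copied from Tables 3 and 4.
unscored = {"MCL": (0.226, 0.338), "MCL-CAw": (0.531, 0.412),
            "CMC": (0.531, 0.330), "HACO": (0.281, 0.467)}
scored = {"MCL": (0.603, 0.545), "MCL-CAw": (0.638, 0.621),
          "CMC": (0.870, 0.462), "HACO": (0.564, 0.441)}

for method in unscored:
    before = f_measure(*unscored[method])
    after = f_measure(*scored[method])
    print(f"{method:8s} F before / after scoring: {before:.3f} / {after:.3f}")
```

For each of these four methods the F-value computed on the scored network is higher than on the unscored network, in line with the second insight above.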



4. Open challenges in complex detection from PPI networks 

The review and evaluation of computational methods in the above sections reveal
several critical challenges facing accurate identification of complexes from
high-throughput interaction datasets. We saw that most methods are considerably
impacted by noise in raw datasets. Further, most methods are able to reconstruct only
a fraction of the known complexes (achieving at most 65% recall) even upon scoring.
This points towards some inherent limitations within the methods themselves. On
this basis, we broadly classify the challenges facing current methods into two
categories: (i) challenges originating from biological datasets; (ii) challenges originating
from existing computational techniques.

4.1. Challenges from interaction datasets 

Even though over the last few years, several independent high-throughput experiments
have helped to catalogue an enormous number of interactions from yeast, they
show a surprising lack of correlation with each other, and a lack of coverage - a bias
towards high abundance proteins and against proteins from certain cellular
compartments (like cell wall and plasma membrane). Also, each dataset
still contains a substantial number of false positives (noise) that can compromise
the utility of these datasets for more focused studies like complex detection, as
seen from our evaluation results. In order to reduce the impact of such discrepancies,
a number of data integration and reliability scoring schemes have been
devised [15-24].

To overcome these challenges to some extent, in our evaluation, we combined
multiple datasets (from Gavin et al. and Krogan et al.) to account for the lack
of interaction coverage, and also adopted scoring prior to predicting complexes.
In spite of these precautionary steps, we notice that most methods are able to
reconstruct only a fraction of the known complexes, and we still have a long way
to go towards identification of meaningful novel complexes through computational
means.



4.2. Challenges from existing complex detection methods 

As noted earlier, even though there have been numerous methods developed for
complex detection, most of them suffer from low recall (at most 65% recall even on
the scored network; Table 4). Even a "union" of these methods achieves at most
70-75% recall on average across a variety of PPI datasets. One of the crucial
reasons for this limitation is that every method, in one way or another, relies on
the key assumption that complexes are embedded among "dense" regions of the
network. However, recent experiments have indicated that relying too much on this
assumption in the wake of insufficient credible interaction data causes these methods
to miss many complexes that are of low densities, that is, "sparse" in the network.
Therefore the need is to find alternative ways to model complexes than mere dense
subnetworks, and also to compensate for the sparsity of topological information by
augmenting other kinds of biological information.



4.2.1. Detection of sparse complexes 

In the attempt to detect sparse complexes, the recent work by Srihari et al. (2012)
is insightful and can guide further directions towards tackling this challenge. Srihari
et al. (2012) noticed that most existing computational methods based on PPI networks
rely overly on the assumption that complexes are embedded among "dense"
regions of the network, and therefore miss most of the "sparse" complexes that have
very low interaction densities or lie disconnected in the network. These complexes
are missed by current methods due to the lack of crucial topological information
(missing interactions and/or proteins) - for example, even in the well-studied organism
yeast, only ~70% of the interactome has been validated and catalogued. In
order to overcome these "topological gaps", the authors proposed careful augmenting
of functional interactions to PPI networks. Functional interactions represent
logical or conceptual associations among proteins, and therefore "encode" a variety
of biological similarities or affinities among proteins beyond just physical interactivity,
thereby compensating for the lack of topological information in PPI networks. In
order to do this augmentation systematically they proposed the SPARC algorithm.

SPARC selects low quality clusters predicted from the physical (PPI) network
using existing methods, and then selectively enhances these clusters to reconstruct
(sparse) complexes. The key idea is that many of these low quality clusters are in
fact fractions (or "pieces") of sparse complexes embedded in the PPI network, but
due to missing interactions they lie "scattered" in the network and do not represent
whole complexes in their current forms. If these clusters can be carefully enhanced
by augmenting functional interactions, then several of the sparse complexes can
be reconstructed accurately. This enhancement involves increasing their interaction
densities and "pulling-in" together their disconnected components. However, during
the selection of the initial set of low quality clusters, many may just represent noisy
predictions (false positives). In order to determine the clusters that are more likely
to represent complexes, SPARC makes use of a novel Component-Edge (CE) score.
The CE-score is a topological measure combining connectivity and relative density
of the clusters, and is shown to more accurately correlate with the topological
characteristics of real complexes compared to traditionally accepted measures like
edge density. The CE-score is calculated for every low quality cluster predicted from
the PPI network. SPARC then augments functional interactions to the clusters and
checks if their CE-scores improve beyond a certain threshold. If a cluster shows the
required improvement, it is output as a potential candidate representing a sparse
complex. Srihari et al. showed through extensive experiments on clusters produced
from methods like MCL, MCL-CAw, CMC [34] and HACO that SPARC
was capable of improving the overall recall of these methods by up to 47% on average
across a variety of networks. Specifically, SPARC helped to reconstruct 25% more
complexes among the ones that were sparse.
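
The overall "score, augment, re-score" workflow described above can be sketched as follows. This is not the SPARC implementation: the CE-score is not defined in this section, so plain edge density is used as a stand-in score (an explicit assumption, and precisely the measure the CE-score is meant to improve upon), and the acceptance rule is assumed to be a minimum gain in score. All names and the toy data are hypothetical.

```python
def edge(a, b):
    """Undirected interaction between two proteins."""
    return frozenset((a, b))


def edge_density(cluster, edges):
    """Stand-in score (NOT the CE-score): fraction of protein pairs that are linked."""
    n = len(cluster)
    if n < 2:
        return 0.0
    internal = sum(1 for e in edges if e <= cluster)
    return internal / (n * (n - 1) / 2)


def enhance_clusters(clusters, ppi, functional, score=edge_density, min_gain=0.2):
    """Keep clusters whose score improves by at least min_gain after augmentation."""
    augmented = ppi | functional
    accepted = []
    for cluster in clusters:
        before = score(cluster, ppi)
        after = score(cluster, augmented)
        if after - before >= min_gain:
            accepted.append((cluster, before, after))
    return accepted


# Toy data: two low-density clusters; only the first gains functional support.
toy_ppi = {edge("A", "B"), edge("C", "D"), edge("E", "F")}
toy_functional = {edge("B", "C"), edge("A", "C"), edge("B", "D")}
toy_clusters = [frozenset("ABCD"), frozenset("EFG")]
candidates = enhance_clusters(toy_clusters, toy_ppi, toy_functional)
print(candidates)
```

Passing the score as a parameter keeps the workflow independent of the particular topological measure, so a CE-score implementation could be substituted without changing the loop.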

4.2.2. Detection of small and temporal complexes 

Small complexes (complexes of two or three proteins) also pose severe challenges in
identification, particularly if PPI network topology is the only available information.
In fact, most complex detection methods based solely on the PPI network only
attempt to identify complexes with at least 4 proteins in the network.
The attempt to predict small complexes (pairs or triplets) from the network based
only on connectivity typically produces a significant fraction of false positives.
Further, the smaller the size of a complex, the more prone it is to sparsity - missing
even a few interactions can result in very low densities or render the complexes
disconnected. Due to these challenges, additional information apart from PPI network
topology is required for detection of such complexes.

In a recent work (2012), Srihari et al. incorporated temporal information
to identify small complexes. The authors focused on identifying small complexes
that are assembled during the yeast cell cycle in a temporal manner. They noticed
that several high-degree proteins such as kinases interacted with different
subsets of proteins during different phases of the yeast cell cycle to assemble multiple
phase-specific complexes. However, due to the lack of temporal information to
disambiguate (segregate) the different phase-specific complexes, existing
topology-based methods produced large clusters of complexes fused together. Srihari et al.
decomposed such large clusters by incorporating temporal information on the phase of
the yeast cell cycle in which each protein showed peak expression, and thereby segregated
the individual constituent complexes. Many of the segregated complexes were small
and represented complexes of kinases and their temporal substrates.

Srihari et al. also noticed that by incorporating such temporal information the
"dynamics" of protein complexes could be better understood. They observed an
interesting relationship between the "staticness" (constitutive expression) of a protein
and its participation in multiple complexes: cells tend to maintain generic
proteins as 'static' to enable their "reusability" across multiple temporal complexes.
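
The idea of segregating a fused cluster by cell-cycle phase can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm: each protein is assumed to carry either a single peak-expression phase or no phase at all (treated as 'static'), and static proteins are assumed to be shared by every phase-specific sub-cluster. The function name, phase labels and protein names are hypothetical.

```python
from collections import defaultdict


def split_by_phase(cluster, peak_phase):
    """Split a cluster into phase-specific groups; static proteins join every group.

    peak_phase maps a protein to the phase of its peak expression. Proteins
    mapped to None, or absent from the mapping, are treated as static.
    """
    static = {p for p in cluster if peak_phase.get(p) is None}
    by_phase = defaultdict(set)
    for protein in cluster:
        phase = peak_phase.get(protein)
        if phase is not None:
            by_phase[phase].add(protein)
    return {phase: members | static for phase, members in by_phase.items()}


# Toy example: one constitutively expressed kinase K with phase-specific substrates.
fused_cluster = {"K", "S1", "S2", "S3"}
toy_phases = {"K": None, "S1": "G1", "S2": "G1", "S3": "M"}
print(split_by_phase(fused_cluster, toy_phases))
```

In this toy case the fused cluster is decomposed into two small complexes that both reuse the static kinase, one with its G1 substrates and one with its M-phase substrate.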



4.3. Challenges in detecting membrane complexes 

Membrane protein complexes are formed by physical interactions among membrane
proteins. Membrane proteins are attached to or associated with the membranes of
the cell or its organelles. Membrane proteins constitute approximately 30% of the
proteomes of organisms, yet they are one of the least studied subsets of proteins.
The study of membrane proteins and their complexes is crucial in understanding
diseases and aiding new drug discoveries.

Membrane protein complexes are notoriously difficult to study using traditional
high-throughput techniques like Y2H and TAP-MS. This is due in part to the
hydrophobic (insoluble) nature of membrane proteins, as well as the ready dissociation
of subunit interactions, either between transmembrane subunits or between
transmembrane and cytoplasmic subunits.

In order to counter the disadvantages of conventional techniques, new biochemical
techniques have been developed recently to facilitate the characterization of
interactions among membrane proteins. Among these is the split-ubiquitin membrane
yeast two-hybrid (MYTH) system. With the development of the high-throughput
MYTH system, a fair number of interactions among membrane proteins
have been recently catalogued in species such as Arabidopsis thaliana and the yeast
Saccharomyces cerevisiae.

The identification of membrane complexes requires understanding their assembly
- how the individual proteins come together to form complexes, and how these
complexes are eventually degraded. This is because membrane proteins are not as
stable as their soluble counterparts. Studies reveal that this assembly occurs in
an orderly fashion, that is, membrane complexes are formed by an ordered assembly
of intermediaries, and in order to prevent unwanted intermediaries, this assembly is
highly aided by chaperones. Many membrane complexes are formed by transient
interactions involving exchange of proteins in and out of existing complexes via
membranes - a phenomenon called 'dynamic exchange'. The need therefore now
is to develop sophisticated algorithms that take into account these aspects specific
to membrane complexes to mine them effectively from membrane sub-interactomes.

5. Conclusions 

Protein complexes are the fundamental functional units responsible for many
biological mechanisms within the cell. Their identification is therefore necessary to
understand the cellular organization and machinery. The advent of high-throughput
techniques for inferring protein interactions in a large-scale fashion has fueled the
development of computational techniques to systematically mine for potential complexes
from the network of interactions. In this work, we reviewed, classified and evaluated
some of the key computational methods developed to date for the detection
of protein complexes from PPI networks. We presented two insightful taxonomies
of existing methods - 'bin-and-stack' and 'tree'. From these taxonomies we note
that scoring of raw interaction datasets (followed by filtering of false positives) and
integrating key biological insights with topology can significantly improve complex
prediction.

Even though more than 20 methods have been developed over the years, complex
detection still requires careful attention in handling errors and noise in experimental
datasets, and in reconstructing complexes with high accuracy. On this front, we
identified some of the crucial limitations and challenges facing current experimental and
computational techniques. Interaction datasets coming from different experimental
sources show a surprising lack of correlation and also contain a sizeable fraction of
spurious (false positive) interactions. This severely impacts the accuracy and coverage
of complex detection methods. Further, computational methods also rely overly on
the assumption that complexes are embedded among densely connected groups of
proteins, an assumption that is not fully valid in the wake of insufficient credible
interactions. Finally, the interactions among membrane proteins have not been
catalogued adequately, making it difficult to identify an important group of complexes
necessary for understanding diseases - membrane complexes.

We hope that our review and assessment of computational methods as well 
as the challenges highlighted here will provide valuable insights to drive future 
research for further advancing the 'state-of-the-art' in computational prediction, 
characterization and analysis of protein complexes from organisms. 